Text to Avatar in Multi-modal Human Computer Interface
Authors
Abstract
In this paper, we present a new text-driven avatar system consisting of three major components: a text-to-speech (TTS) unit, a speech-driven facial animation (SDFA) unit, and a text-to-sign-language (TTSL) unit. A new visual prosody time control model and an integrated learning framework are proposed to realize synchronization among speech synthesis, face animation, and gesture animation, which is crucial for this multi-modal synthesis system. Given meaningful sentences, the text-to-sign-language system, combined with the text-to-speech system, produces visual prosody information, including gesture animation parameters and timing information for the text-to-speech unit. The text-to-speech system then produces speech according to that timing information and a set of prosody rules. Finally, the speech directly drives MPEG-4-based face animation, with additional rules for facial expressions. This paper highlights synergies among the audio, visual, and gesture technology components. The performance of our system shows that the proposed algorithms are effective and greatly improve the realism of multi-modal speech synthesis.
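The pipeline described above — TTSL emitting gesture parameters and timing constraints, TTS synthesizing speech to match those timings, and the speech then driving MPEG-4-style face animation — can be sketched as follows. This is a minimal illustrative skeleton, not the authors' implementation: all function and field names (`text_to_sign`, `synthesize_speech`, `animate_face`, `duration_ms`) are hypothetical, and the per-word timing model is a stand-in for the paper's visual prosody time control model.

```python
from dataclasses import dataclass

@dataclass
class GestureSegment:
    """One sign-language gesture with its animation timing (hypothetical schema)."""
    word: str
    duration_ms: int  # timing constraint handed to the TTS unit

def text_to_sign(sentence: str) -> list[GestureSegment]:
    """TTSL unit (stub): map each word to a gesture with a nominal duration."""
    return [GestureSegment(word=w, duration_ms=400) for w in sentence.split()]

def synthesize_speech(sentence: str, timings_ms: list[int]) -> list[tuple[str, int]]:
    """TTS unit (stub): pair each word's synthesized audio with the gesture timing,
    so speech and gesture animation stay in lockstep."""
    return list(zip(sentence.split(), timings_ms))

def animate_face(speech: list[tuple[str, int]]) -> list[dict]:
    """SDFA unit (stub): derive MPEG-4-style animation frames from the speech stream."""
    frames, t = [], 0
    for word, dur in speech:
        frames.append({"t_ms": t, "viseme_for": word})
        t += dur
    return frames

def render(sentence: str):
    """Run the three units in sequence, sharing one timing source."""
    gestures = text_to_sign(sentence)                     # TTSL: gestures + timings
    speech = synthesize_speech(
        sentence, [g.duration_ms for g in gestures])      # TTS honors those timings
    faces = animate_face(speech)                          # speech drives the face
    return gestures, speech, faces
```

The key design point the sketch captures is that a single timing stream, produced by the gesture unit, is reused by both the speech and face-animation units, which is what keeps the three modalities synchronized.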
Similar Papers
Getting Closer – Tailored Multi-Modal Human-Computer Interaction
This paper outlines our vision of an advanced multi-modal call center using avatar technology, which adapts content, presentation, and interaction strategy to properties of the caller such as age, gender, and emotional state. User studies on Interactive Voice Response (IVR) systems have shown that these properties could be used effectively to “tailor” services to users who do not maintain perso...
Full Text
Architecture of a multi-modal dialogue system oriented to multilingual question-answering
In this paper, a proposal for a multi-modal dialogue system oriented to multilingual question-answering is presented. This system includes the following modes of access: voice, text, avatar, gestures, and sign language. The proposal is oriented to the question-answering task as a user interaction mechanism. The proposal presented here is in the first stages of its development phase, and the archite...
Full Text
Intelligent Virtual Agent: Creating a Multi-modal 3D Avatar Interface
Human-computer interactions can be greatly enhanced by the use of 3D avatars, representing both human users and computer systems in 3D virtual spaces. This allows the human user to interface with the computer system through natural and intuitive human-to-human dialog (face-to-face conversation), thereby continuing to blur the boundaries between the real and virtual worlds. This proposed avata...
Full Text
A Multi-Modal System Intellectual Computer AssistaNt
The paper describes a multi-modal system, ICANDO (Intellectual Computer AssistaNt for Disabled Operators), developed by the Speech Informatics Group of SPIIRAS and intended to assist people without hands, or with disabilities of their hands or arms, in human-computer interaction. This system combines modules for automatic speech recognition and head tracking in one multi-modal syste...
Full Text
Multi-modal Aided Presentation of Learning Information: a Usability Comparative Study
This paper presents a comparative two-group experimental study exploring whether the addition of multimodal interaction metaphors would enhance the usability of e-learning interfaces. Two independent groups of users took part in the experiment, each testing one of the two interface versions provided by the experimental e-learning tool. The first interface was based on a textual approach in...
Full Text